1 Introduction


When  a  scene  is  recorded  from  two  (or  more)  different  positions  in  space,  objects  are  projected 
into  different  locations  in  each  image.  The  disparity  in  position  between  the  two  images  may 
be used to obtain the exact coordinates of objects if the motion of the camera relative to each
object  is  known.  This  view  of  motion  and  stereo  regards  vision  as  a  problem  of  inverse  optics, 
namely,  the  goal  is  to  find  the  inverse  transformation  of  the  optical  imaging  process  (perspective 
projection). The computation is usually divided into two main steps. The first is correspondence:
matching features in the two images to find the appropriate disparity in position for
each  object  or  feature.  This  may  be  a  difficult  computation  for  many  image  pairs.  In  stereo 
in  particular  it  is  considered  the  heart  of  the  computational  problem  (e.g.,  [1]).  Henceforth  I 
will  assume  that  matching  is  given.  The  second  step  is  the  determination  of  the  motion  (or 
camera) parameters that can be used to compute the distance to objects in space using geometrical
transformation. This is, in general, a very difficult computation. I will discuss some
important  higher  level  goals  for  which  it  can  be  avoided.  For  these  limited  goals  solving  the 
second  subproblem  may  be  unnecessary. 

The  problem  of  computing  the  motion  parameters  from  motion  disparities  or  optical  flow 
(local  velocities)  has  received  much  attention.  The  corresponding  problem  of  camera  calibration 
in stereo, however, is often ignored. The attention to motion parameters is often motivated by the
assumption that this computation is a prerequisite for higher level tasks such as navigation or
recognition. For example, for the computation of a complete 3D structure from motion the motion parameters
should  be  known.  Structure  from  motion  results  often  deal  mainly  with  the  minimal  number  of 
points  that  are  necessary  to  compute  the  inverse  transformation  (see  [2]).  For  this  purpose  it  has 
been  shown  that  7  or  8  matched  points  in  two  views  ([3]  and  [4])  or  5  points  and  their  velocities 
in  one  view  ([5])  are  sufficient.  The  actual  algorithms,  however,  are  typically  computationally 
expensive  and  sensitive  to  noise.  It  is  hard  to  guarantee  a  sufficiently  good  estimation  of  the 
motion  parameters  to  maintain  small  errors  in  the  structure  computation  (see  [6]). 

General motion can be decomposed into a rotation around some axis followed by a translation.
In a similar way the optical flow vector can be decomposed into two components: one
due  to  the  translation  component  of  the  motion  and  one  due  to  the  rotation  component.  In 
perspective  projection  and  if  the  motion  is  translation  only,  the  optical  flow  takes  a  very  simple 
form:  straight  lines  that  intersect  at  a  single  point,  the  focus  of  expansion  (FOE),  see  figure  1. 
This  point  is  the  projection  of  the  point  towards  which  (or  away  from  which)  the  camera's 
motion  is  directed.  If  the  motion  is  rotational  only,  the  flow  field  takes  the  form  of  concentric 
circles (see figure 1). It has been argued that if we can identify the two components of the flow
field then the problem is almost solved: the direction of motion and relative depth of all points
can  be  computed  from  the  translational  component  of  the  motion  (see  [7]). 
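
The straight-line structure of a purely translational flow field can be checked numerically. The following is a minimal sketch, assuming a pinhole camera with focal length 1 and a static scene; the function names (`project`, `translational_flow`, `focus_of_expansion`) are illustrative and do not come from the paper.

```python
def project(point):
    """Perspective projection onto the image plane z = 1 (focal length 1 is an assumption)."""
    x, y, z = point
    return (x / z, y / z)

def translational_flow(point, t):
    """Image positions of a static 3D point before and after the camera translates by t."""
    # Moving the camera by t is equivalent to moving the point by -t.
    moved = tuple(p - ti for p, ti in zip(point, t))
    return project(point), project(moved)

def focus_of_expansion(t):
    """The FOE is the projection of the translation direction (requires t[2] != 0)."""
    return project(t)

def flow_line_passes_through(p0, p1, q, tol=1e-9):
    """True if image point q lies on the straight line through p0 and p1."""
    ux, uy = p1[0] - p0[0], p1[1] - p0[1]
    vx, vy = q[0] - p0[0], q[1] - p0[1]
    return abs(ux * vy - uy * vx) < tol
```

For any scene point, the line through its two projections passes through `focus_of_expansion(t)`, which is the property illustrated on the left of figure 1.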

Because  of  the  practical  difficulties  in  devising  a  robust  algorithm  that  will  find  a  complete 
solution  of  the  problem,  the  need  for  a  more  qualitative  approach  to  motion  analysis  and  to 
vision  in  general  has  been  expressed  (e.g.,  [8],  [9]  and  [10]).  It  has  been  motivated  in  part  by 
the  experimentally  plausible  hypothesis  that  human  vision  does  not  compute  the  exact  inverse 
mapping of the projection of a 3D world onto a 2D retina. In addition, for many purposes, such
as navigation, it has been shown that the complete solution of the motion parameters may not
be necessary (e.g., [11] and [12]). The computation of an exact 3D structure may not even be
necessary for recognition. The exact 3D coordinates of a surface do not seem to be a good
representation for either storage or recognition (see [13]); a more concise representation seems
necessary. The sign of the Gaussian curvature of a surface is one candidate to be a part of a
useful representation of the surface. In accordance with this view, Koenderink and van Doorn
(see [14], [15] and [16]) have proposed an alternative theoretical approach to the analysis of
stereo and motion (assuming matching is given). They show how various qualitative properties
of objects and the motion field are related to invariants of a vector field (the optical flow or
stereo disparity field).

Figure 1: An example of an optical flow. Left: translation only; right: rotation only.

In  this  work  I  will  discuss  some  motion  and  shape  characteristics  that  can  be  computed 
directly  from  motion  and  stereo  disparities  with  a  very  simple  operator.  It  is  not  necessary  to 
go  first  through  the  computationally  difficult  and  error  sensitive  process  of  recovering  the  exact 
function  of  the  surface  and  the  motion  parameters.  Thus  additional  errors  in  the  computation 
caused  by  using  motion  parameters  that  have  been  obtained  from  noisy  data  are  avoided.  It 
should  be  noted  that  the  computation  of  the  shape  features  discussed  here  is  not  immediate 
even  when  a  complete  3D  reconstructed  surface  is  given  (see  [13]  and  [17]). 

First, the sign of the normal curvature of a curve on a surface is computed by following
three points on the curve that are collinear in one image. If the points remain collinear in
the other image, the normal curvature is 0. In forward motion, if the smaller angle created
by the three points in the other image is turned towards the focus of expansion (FOE), the
sign is negative. If the smaller angle is turned away from the FOE, the sign is positive. In
backward motion the sign reverses. Note that the direction of the normal to the surface is not
needed for this computation. Although perspective projection is assumed (otherwise the focus
of expansion is not defined), its effects on motion disparities can be large or negligible (in the
orthographic projection limit).

Regardless  of  the  location  of  the  FOE,  this  simple  operator  can  be  computed  at  a  selected  set 
of  directions  around  a  point  to  determine  the  sign  of  the  Gaussian  curvature  of  a  local  surface 
patch,  an  intrinsic  property  of  the  surface.  From  this  analysis,  the  direction  of  the  translational 
component  of  the  motion  is  immediately  obtained.  From  this  component  it  is  possible  to  obtain 
the  focus  of  expansion  (FOE).  The  location  of  the  FOE  can  be  used  to  complete  the  classification 
of local surface patches as convex, concave, parabolic (cylindrical), hyperbolic (saddle point)
or  planar.  In  addition,  the  directions  of  the  axes  of  zero  curvature,  and  hence  the  directions  of 
the  principal  axes,  are  also  immediately  obtained  from  this  computation.  The  analysis  does  not 
depend  upon  special  constraints  on  the  nature  of  objects  in  the  environment,  such  as  assuming 
smoothly  curved  surfaces  or  a  particular  analytic  representation  of  the  surface. 

The  rest  of  the  paper  is  organized  as  follows.  In  section  2  I  review  the  basic  differential 
geometry  concepts  of  normal  curvature  and  Gaussian  curvature  and  their  potential  usefulness 
for  object  representation.  In  section  3  I  show  how  surfaces  are  classified  and  the  focus  of 
expansion  is  computed  as  described  above.  In  stereo  the  ambiguity  of  a  region  with  positive 
Gaussian  curvature  can  be  resolved  without  additional  computations,  as  shown  in  section  3.3.  In 
section  4  I  show  that  the  simple  sign  operator  described  in  section  3  is  almost  as  accurate  in  the 
presence  of  noise  as  the  best  algorithm  that  uses  the  3D  coordinates  obtained  from  the  same 
noisy  data  and  using  perfect  motion  parameters  (i.e.  uncorrupted  inverse  transformation). 
Since  one  would  expect  the  noise  to  corrupt  the  motion  parameters  estimation  significantly, 
the  sign  algorithm  that  uses  2D  projections  directly  seems  to  be  more  robust.  In  section  5  I 
discuss  the  possible  relevance  of  these  results  to  biological  vision.  I  also  discuss  the  relation  to 
some  literature  about  structure  from  motion.  The  proofs  of  the  results  discussed  in  section  3 
are  given  in  the  appendix. 

2 Surface curvature and its importance to object representation

The  normal  curvature  of  a  3D  curve  on  a  regular  surface  through  some  point  is  its  curvature 
with  respect  to  the  normal  to  the  surface.  That  is,  the  curve  is  projected  on  a  plane  that 
includes  the  normal  and  its  tangent  (a  normal  section)  and  the  curvature  of  the  projected 
planar  curve  is  the  normal  curvature  of  the  original  3D  curve,  see  figure  2.  The  curvature  of  a 
curve  relative  to  the  normal  to  the  surface  is  what  determines  the  curvature  of  the  surface.  For 
example,  if  all  normal  curvatures  are  negative,  namely  all  the  curves  are  convex  relative  to  the 
normal,  the  surface  is  convex.  If  all  are  concave,  the  surface  is  concave.  If  some  are  convex  and 
some  concave,  the  surface  is  hyperbolic,  i.e.  it  has  a  saddle  point. 

The  normal  curvature  of  all  the  curves  on  the  surface  through  some  point  can  be  written 
as a linear combination of two principal curvatures $\kappa_1$ and $\kappa_2$. These are the curvatures of two
perpendicular curves on the surface, the principal axes, that obtain the extrema of the normal
curvatures of all curves on the surface passing through the same point. Let $\kappa_n$ denote the
normal curvature of some curve on the surface that makes an angle $\theta$ with the first principal
axis. Then

$$\kappa_n = \kappa_1 \cos^2\theta + \kappa_2 \sin^2\theta . \tag{1}$$

Thus the curvature of a local surface patch can be described in terms of two numbers only,
$\kappa_1$ and $\kappa_2$. The product of the two principal curvatures $\kappa_1 \cdot \kappa_2$ is called the Gaussian curvature
of  the  surface.  It  characterizes  the  surface  independently  of  the  environment. 

The  sign  of  the  Gaussian  curvature  locally  classifies  the  surface  as  follows: 

1. elliptic ($\kappa_1 \cdot \kappa_2 > 0$),

   • convex, see figure 3a-left ($\kappa_1, \kappa_2 < 0$)

   • concave, see figure 3a-right ($\kappa_1, \kappa_2 > 0$)

2. parabolic (cylindrical), see figure 3b ($\kappa_1 \cdot \kappa_2 = 0$, $\kappa_1 > 0$ or $\kappa_2 < 0$),

3. hyperbolic (saddle point), see figure 3c ($\kappa_1 \cdot \kappa_2 < 0$, i.e. $\kappa_1 > 0$ and $\kappa_2 < 0$),

4. planar ($\kappa_1 \cdot \kappa_2 = 0$, $\kappa_1 = \kappa_2 = 0$).

Figure 2: The normal curvature of a 3D curve u on a surface, whose tangent through P is w.
Below is the projection of the curve on the normal section. Left: a convex example (negative
curvature), right: a concave example (positive curvature).

Figure 3: An illustration of the different surface types used for classification of surfaces, see
text.

It  follows  from  equation  (1)  that  the  number  of  asymptotes,  or  the  number  of  curves  on  the 
surface  with  zero-curvature,  determines  the  type  of  the  surface.  Namely, 

1.  elliptic:  no  asymptote, 

2.  parabolic:  one  asymptote, 

3.  hyperbolic:  two  asymptotes, 

4.  planar:  infinite  number  of  asymptotes. 
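
The link between equation (1), the sign of the Gaussian curvature and the number of asymptotes can be made concrete with a short sketch. The function names and the tolerance `eps` below are assumptions introduced for illustration only.

```python
import math

def normal_curvature(k1, k2, theta):
    """Equation (1): normal curvature at angle theta (radians) from the first principal axis."""
    return k1 * math.cos(theta) ** 2 + k2 * math.sin(theta) ** 2

def surface_type(k1, k2, eps=1e-12):
    """Classify a local patch from its two principal curvatures."""
    z1, z2 = abs(k1) < eps, abs(k2) < eps
    if z1 and z2:
        return "planar"
    if z1 or z2:
        return "parabolic"
    return "elliptic" if k1 * k2 > 0 else "hyperbolic"

def asymptote_angles(k1, k2, eps=1e-12):
    """Angles in [0, pi) where equation (1) vanishes; None means every direction."""
    kind = surface_type(k1, k2, eps)
    if kind == "planar":
        return None
    if kind == "elliptic":
        return []
    if kind == "parabolic":
        # The zero-curvature direction is the principal axis whose curvature is zero.
        return [0.0] if abs(k1) < eps else [math.pi / 2]
    theta0 = math.atan(math.sqrt(-k1 / k2))  # from tan^2(theta) = -k1 / k2
    return [theta0, math.pi - theta0]
```

The length of the returned list (0, 1 or 2) matches the asymptote counts listed above, and `None` stands for the planar case in which every direction has zero curvature.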

Thus for surfaces where the asymptotes are locally straight lines on the surface, the number of
straight  lines  on  the  surface  that  cross  a  point  will  determine  the  type  of  the  surface.  Various 
cues  like  intensity  gradients  (see  [18])  can  be  used  to  determine  whether  a  straight  line  in 
the  image  originated  from  a  straight  line  on  the  surface  (and  thus  of  zero-curvature).  Motion 
and  stereo  disparities  help  determine  the  sign  of  the  curvature  in  between  the  zero-curvature 
directions  which  is  necessary  for  surface  classification  (see  section  3). 

The  shape  of  most  objects  can  be  described  by  an  analytic  function  of  the  surface,  i.e. 
a  relative  depth  map.  For  purposes  of  storage  efficiency  and  recognition,  a  complete  depth 
map  seems  wasteful.  As  a  representation  it  is  sensitive  to  viewing  direction  and  noise;  it  is 
computationally  expensive  to  match  at  a  recognition  stage;  and  it  does  not  easily  generalize  to 
give  a  single  representation  for  similar  objects.  One  alternative  is  representing  the  shape  of  an 
object  as  a  collection  of  parts  where  each  part  is  described  by  a  few  surface  features.  Classifying 
regions  as  convex,  concave,  planar,  cylindrical,  or  hyperbolic  provides  one  important  intrinsic 
surface  feature.  This  classification  can  also  help  in  finding  part  boundaries  within  an  object 
(figure  4a)  that  occur  often  at  parabolic  lines.  Often  the  axes  of  principal  curvature  and  axes 
of zero-curvature, like parabolic lines that are the boundaries between different surface types,
give  important  directions  on  the  surface  (figure  4b). 

3  Shape  classification 

3.1  Surface  curvature  and  FOE  from  motion  disparities 

Henceforth perspective projection and a motion with nonzero translational component are assumed
so that the focus of expansion (see section 1) is defined. Under these conditions the
analysis holds at the orthographic projection limit (that is, the perspective projection has negligible
effect on the disparities yet the FOE is defined). In this limit the motion should not be
translation in depth only.

Figure 4: Why classify surfaces: a) the classification may help divide an object into parts; b)
axes of principal curvature are often meaningful curves on the surface. The dashed lines are
parabolic lines.


Proposition 1 Let $P_0$ denote a point on the surface of some object whose projection in the
first image is $O_0$. Let $P_1$ and $P_2$ denote two other points on the same surface whose projections
in the first image are $O_1$ and $O_2$, and where $O_0$, $O_1$ and $O_2$ are collinear. Let $O_0'$, $O_1'$ and $O_2'$
be the projections of the same three points in a second image. Assume the motion is backward
(away from the focus of expansion). Then the sign of the normal curvature of the curve $\xi$
passing through $P_0$, $P_1$ and $P_2$ can be determined as follows:

• if the smaller angle through $O_0'$, $O_1'$ and $O_2'$ is turned towards the focus of expansion then
the normal curvature of $\xi$ is positive (see figure 5a).

• if $O_0'$, $O_1'$ and $O_2'$ are collinear then the normal curvature of $\xi$ is 0 (see figure 5b).

• if the smaller angle through $O_0'$, $O_1'$ and $O_2'$ is turned away from the focus of expansion
then the normal curvature of $\xi$ is negative (see figure 5c).

In forward motion (towards the focus of expansion), the interpretation of the angle is reversed.
(The  motion  of  the  coordinate  system  is  defined  to  be  a  rotation  followed  by  a  translation.) 

A  proof  is  given  in  the  appendix.  It  consists  of  two  steps.  First,  it  is  shown  that  the  sign  of  the 
normal  curvature,  the  sign  of  a  curve’s  curvature  relative  to  the  normal  to  the  surface,  equals 
the  sign  of  the  curvature  relative  to  the  line  of  sight  in  the  first  image.  Thus  the  direction  of  the 
normal  is  not  needed  for  this  computation.  Second,  it  is  shown  that  the  sign  of  the  curvature 
relative  to  the  line  of  sight  equals  the  sign  of  the  curvature  relative  to  the  line  through  the  FOE 
and  the  curve  in  almost  any  2D  perspective  projection  of  the  curve,  e.g.  in  the  second  image. 

Figure  6  illustrates  the  implication  of  proposition  1.  In  a  concave  region,  three  collinear 
points  in  the  first  image  will  move  to  three  non-collinear  points  in  the  second  image  turning 
towards  the  focus  of  expansion. 
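
The rule of proposition 1 can be sketched directly on image coordinates. This is an illustration under one stated assumption: the smaller angle at the middle point is taken to be "turned towards" the FOE when the bisector of that angle points into the half-plane containing the FOE. The function names are hypothetical, and the three points are assumed to be distinct.

```python
import math

def turn_relative_to_foe(o0, o1, o2, foe, tol=1e-9):
    """+1 if the smaller angle at o0 opens towards the FOE, -1 if away, 0 if collinear.

    o0, o1, o2 are the second-image positions of three points that were collinear
    in the first image, with o0 the middle point.
    """
    a = (o1[0] - o0[0], o1[1] - o0[1])
    b = (o2[0] - o0[0], o2[1] - o0[1])
    na, nb = math.hypot(*a), math.hypot(*b)
    cross = a[0] * b[1] - a[1] * b[0]
    if abs(cross) <= tol * na * nb:
        return 0  # still collinear: zero normal curvature
    # The sum of the two unit vectors points into the smaller angle (assumed convention).
    bis = (a[0] / na + b[0] / nb, a[1] / na + b[1] / nb)
    f = (foe[0] - o0[0], foe[1] - o0[1])
    return 1 if bis[0] * f[0] + bis[1] * f[1] > 0 else -1

def normal_curvature_sign(o0, o1, o2, foe, backward=True):
    """Proposition 1: towards the FOE is positive in backward motion, reversed in forward motion."""
    s = turn_relative_to_foe(o0, o1, o2, foe)
    return s if backward else -s
```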

In practice I compute the difference of the slopes of the line segments through $O_0'$ and $O_1'$
and through $O_0'$ and $O_2'$, angles $\beta_1$ and $\beta_2$ in figure 5a. Thus, if $O_i' = (x_i, y_i)$, the sign operator
$T$ is defined from these two slopes [the defining equation is missing from the source text].

Figure 5: The sign of the normal curvature is determined by the relation between the angle
through three points in the second image, that are collinear in the first image, and the focus of
expansion. Above is the first image, where $O_0$, $O_1$ and $O_2$ are collinear. Below are the corresponding
points in the second image, $O_0'$, $O_1'$ and $O_2'$: a) the normal curvature is positive, b) the normal
curvature is 0, c) the normal curvature is negative.

Figure 6: In a concave region, collinear points (left) move to noncollinear points (right) that
are turning towards the focus of expansion.

The dependence of the relation between the sign of T and the sign of the normal curvature
on the location of the FOE is summarized in the following proposition (proof is given in the
appendix):

Proposition 2 Choose $O_1$ and $O_2$ so that they are collinear with $O_0$ and lie on different sides
of $O_0$. Assume backward motion (the motion is defined now as a translation followed by a
rotation). If $O_1$ is chosen such that the angle through $O_1$, $O_0$ and the FOE going clockwise is
smaller than 180°, that is, $O_1$ is above the sign-bisector in figure 5, then the sign of T equals
the sign of the normal curvature of $\xi$. If $O_1$ is chosen so that the angle is larger than 180° then
the sign of T is opposite to the sign of the normal curvature of $\xi$. If the angle equals 180° then
T is identically 0.

One result of proposition 2 is that if $O_1$ is chosen around $O_0$ in all orientations between 0° and
360°, the correlation between the sign of T and the sign of the normal curvature reverses at
the orientation where $O_1$, $O_0$ and the FOE are collinear ($r_0$ in figure 5). The direction where
T changes sign will be used later to compute the direction of the translational component of
the motion at $P_0$.
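
The side condition of proposition 2 depends only on the clockwise angle from the ray $O_0 O_1$ to the ray from $O_0$ to the FOE. A minimal sketch follows, assuming image coordinates with the y-axis pointing up (with a y-down image the sense of "clockwise" flips) and treating both collinear configurations as the undetermined case; the function names are illustrative.

```python
import math

def clockwise_angle_deg(o1, o0, foe):
    """Clockwise angle in [0, 360) from ray o0->o1 to ray o0->foe (y-axis up assumed)."""
    a = (o1[0] - o0[0], o1[1] - o0[1])
    b = (foe[0] - o0[0], foe[1] - o0[1])
    cross = a[0] * b[1] - a[1] * b[0]
    dot = a[0] * b[0] + a[1] * b[1]
    # atan2 gives the counterclockwise angle, so negate it.
    return (-math.degrees(math.atan2(cross, dot))) % 360.0

def sign_relation(o1, o0, foe, tol_deg=1e-9):
    """+1 if sign(T) equals the curvature sign, -1 if opposite, 0 on the sign-bisector."""
    ang = clockwise_angle_deg(o1, o0, foe)
    if min(ang, abs(ang - 180.0), 360.0 - ang) <= tol_deg:
        return 0  # o1, o0 and the FOE are collinear
    return 1 if ang < 180.0 else -1
```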

Now it is possible to classify the surface near a point $P_0$ using the following simple algorithm:
In the first image, for each direction r from a sample set of directions around $O_0$ (see upper
part of figure 5) choose two points in the image, $O_1$ and $O_2$, on both sides of $O_0$ so that they are
collinear and $O_1$ defines a slope r. It is assumed that $O_1$ and $O_2$ are the projections of points
lying on the same surface as $P_0$. Choose $O_1$ at all orientations r around $O_0$, 0° < r < 360°.
Compute T(r) for all r. Then:

• T(r) changes sign twice (see figure 7 above) => surface is elliptic,

• T(r) changes sign twice and obtains the value 0 for some other directions r and r + 180°
without changing sign => surface is parabolic,

• T(r) = 0 => surface is planar,

• T(r) changes sign six times (see figure 7 below) => surface is hyperbolic

(the sign changes four times at axes of zero-curvature and twice at the sign-bisector).

In  the  presence  of  noise,  some  threshold  should  be  used  instead  of  0,  which  may  cause  regions 
whose  curvature  is  low  to  be  classified  as  planar. 
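
The counting rule above can be sketched as follows, given samples of T(r) over a full turn. The threshold argument plays the role of the noise threshold just mentioned; the function names and the "unclassified" fallback are assumptions made for this illustration.

```python
def _signs(values, threshold):
    """Map each sample to -1, 0 or +1, treating |v| <= threshold as zero."""
    return [0 if abs(v) <= threshold else (1 if v > 0 else -1) for v in values]

def count_sign_changes(values, threshold=0.0):
    """Number of sign changes around the circle, skipping near-zero samples."""
    nz = [s for s in _signs(values, threshold) if s != 0]
    return sum(1 for i in range(len(nz)) if nz[i] != nz[i - 1])

def count_zero_touchings(values, threshold=0.0):
    """Number of zero runs that are flanked by the same sign on both sides."""
    s = _signs(values, threshold)
    if all(v == 0 for v in s):
        return 0
    start = next(i for i, v in enumerate(s) if v != 0)
    rot = s[start:] + s[:start]          # rotate so the scan starts on a non-zero sample
    touchings, prev, in_zero_run = 0, rot[0], False
    for v in rot[1:] + rot[:1]:          # wrap around to close the circle
        if v == 0:
            in_zero_run = True
        else:
            if in_zero_run and v == prev:
                touchings += 1
            prev, in_zero_run = v, False
    return touchings

def classify_without_foe(values, threshold=0.0):
    """Classify a patch from T(r) sampled over 0..360 degrees, FOE unknown."""
    if all(s == 0 for s in _signs(values, threshold)):
        return "planar"
    changes = count_sign_changes(values, threshold)
    if changes == 6:
        return "hyperbolic"
    if changes == 2:
        return "parabolic" if count_zero_touchings(values, threshold) > 0 else "elliptic"
    return "unclassified"
```

Raising `threshold` makes more low-curvature samples count as zero, so weakly curved patches drift towards the planar label, as noted above.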

The  sign  of  T(r)  is  ambiguous  when  the  location  of  the  FOE  is  not  known.  It  gives  the 
sign  of  the  normal  curvature  for  a  range  r0  <  r  <  r0  +  180°  for  some  r0  and  the  inverse  sign 
for other values of r. The direction $r_0$ is called the sign-bisector (see figure 7). It is the direction
where  T(r)  changes  sign  independently  of  the  normal  curvature. 

The same $r_0$ gives the direction of the translational component of the motion at $P_0$. This
motion component can be used to obtain the focus of expansion and relative depth. In the
elliptic case it is the only direction along which T(r) changes sign (figure 7). All such lines
at angles $r_0(P_0)$ for different points $P_0$ intersect at a single point, the FOE (see figure 8). In
the hyperbolic case, T(r) changes sign at three directions (six orientations), as is illustrated in
figure 9 right. Two are axes of zero-curvature and the third is the translational component of the
motion. The third axes of sign change of T(r) at all the points intersect at the FOE.

Figure 7: The sign of T(r) depends on the relative position of $O_1$ with respect to $O_0$ and the
FOE. In the figure, the circle represents possible locations of $O_1$; the +/- inside indicates the
sign of T(r), whereas the sign of the normal curvature is given in parentheses. Above is the
elliptic case, below is the hyperbolic case.
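
Since the sign-bisector lines at different image points all pass through the FOE, the FOE can be estimated as their common intersection. The paper does not say how the intersection is computed; the sketch below assumes a least-squares intersection, which tolerates noisy directions, and the function name and input format are hypothetical.

```python
import math

def intersect_lines(lines):
    """Least-squares intersection of 2D lines given as ((x, y), angle_in_radians).

    Minimizes the sum of squared perpendicular distances to all lines.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), ang in lines:
        nx, ny = -math.sin(ang), math.cos(ang)   # unit normal of the line
        c = nx * px + ny * py                    # line equation: n . x = c
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        b1 += nx * c
        b2 += ny * c
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel or nearly parallel")
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)
```

With exact directions two points suffice; using more points averages out errors in the estimated sign-bisector directions.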

The location of the FOE can be used to complete the surface classification with T(r) if $O_1$
is chosen so that the angle between $O_1$, $O_0$ and the FOE when going clockwise is smaller than
180°. The classification algorithm is now:

• T(r) = 0 ∀r => surface is planar.

• T(r) > 0 ∀r => surface is concave.

• T(r) < 0 ∀r => surface is convex.

• T(r) ≥ 0 ∀r or T(r) ≤ 0 ∀r => surface is parabolic (cylindrical). The axis of zero
curvature is the axis for which T(r) = 0.

• T(r) changes sign => surface is hyperbolic. In this case the asymptotes are the directions
for which T(r) = 0. The principal directions (directions of minimum and maximum
curvature) are the lines that bisect the two angles defined by the asymptotes.

Note  that  this  classification  is  done  without  the  computation  of  the  normal  to  the  surface. 
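
Once the FOE is known, the classification reduces to inspecting the signs of T(r) over the half-turn on which the sign of T(r) equals the sign of the normal curvature. A minimal sketch, with a hypothetical function name and a noise threshold as an added assumption:

```python
def classify_with_foe(values, threshold=0.0):
    """Classify a patch from T(r) sampled where sign(T) equals the curvature sign."""
    signs = [0 if abs(v) <= threshold else (1 if v > 0 else -1) for v in values]
    has_pos, has_neg, has_zero = 1 in signs, -1 in signs, 0 in signs
    if not has_pos and not has_neg:
        return "planar"
    if has_pos and has_neg:
        return "hyperbolic"
    if has_zero:
        return "parabolic"          # one sign everywhere, except a zero-curvature axis
    return "concave" if has_pos else "convex"
```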

To  summarize,  by  computing  the  sign  of  T(r)  for  all  0°  <  r  <  360°  we  can  classify  a  surface 
as  elliptic,  hyperbolic,  planar,  or  parabolic.  At  each  point  we  also  obtain  the  direction  of  the 
translational  component  of  the  motion.  By  using  more  than  one  point  we  are  able  to  compute 
the  location  of  the  focus  of  expansion  and  thus  further  classify  an  elliptic  region  as  convex  or 
concave.  In  a  hyperbolic  region  we  obtain  at  each  point  three  axes,  two  of  which  are  axes  of 
zero-curvature  and  one  is  the  translational  component  of  the  motion.  From  the  two  axes  of 
zero  curvature  we  can  compute  the  principal  axes,  the  axes  of  minimal  and  maximal  curvature, 
that  are  the  two  angle  bisectors  of  the  two  axes  of  zero-curvature. 
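
The bisector property follows from equation (1). As a short worked derivation, assuming a hyperbolic point ($\kappa_1 \kappa_2 < 0$) and measuring $\theta$ from the first principal axis:

```latex
\begin{align*}
\kappa_n(\theta) = \kappa_1\cos^2\theta + \kappa_2\sin^2\theta = 0
  &\;\Longrightarrow\; \tan^2\theta = -\frac{\kappa_1}{\kappa_2},
  \qquad \theta = \pm\theta_0,\quad
  \theta_0 = \arctan\sqrt{-\kappa_1/\kappa_2},\\
\frac{d\kappa_n}{d\theta} = (\kappa_2-\kappa_1)\sin 2\theta = 0
  &\;\Longrightarrow\; \theta \in \{0,\ \pi/2\}.
\end{align*}
```

The two zero-curvature axes lie at $\pm\theta_0$, symmetric about $\theta = 0$ and $\theta = \pi/2$, which are exactly the directions of extremal normal curvature. The principal axes therefore bisect the two angles formed by the axes of zero-curvature.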

3.2  Examples: 

Synthetic  objects  (a  sphere  and  a  torus)  have  been  classified  using  the  following  algorithm: 

For each pixel (denoted $P_0$) in the first image that belongs to the object:

1. for each r in the range -90° < r < 90°, with 1° increments:

(a)  find  two  points  on  both  sides  of  P0  that  belong  to  the  object  and  so  that  the  three 
points  are  collinear  with  slope  r. 

(b)  find  the  coordinates  of  the  three  points  in  the  second  image  by  computing  the  motion 
transformation. 

(c) compute T(r).

2.  count  the  number  of  zero-curvature  axes: 

(a)  count  the  number  of  zero-crossings  of  T(r). 

(b)  count  the  number  of  zero-touchings  of  T(r). 

(c) add the two numbers and subtract 1 (for $r_0$, see figure 7).

(d) save the zero-crossings and the zero-touchings. The single zero-crossing in the
parabolic and elliptic cases is the translation component of the motion at $P_0$. The
zero-touching in the parabolic case is the axis of zero curvature. The three
zero-crossings in the hyperbolic case are the translation component of the motion at $P_0$
and the two axes of zero curvature.

3.  classify  P0  as  elliptic,  parabolic,  planar  or  hyperbolic  according  to  the  number  of  axes  of 
zero-curvature. 

4. Classify further an elliptic point:

(a)  if  the  location  of  the  FOE  is  not  known  and  more  than  two  points  have  already  been 
analyzed,  compute  the  location  of  the  FOE.  Go  to  the  next  point  if  the  location 
of  the  FOE  is  not  known  or  if  it  is  not  known  whether  the  motion  is  backward  or 
forward. 

(b)  take  the  sign  of  T(r)  at  r  =  90°. 

(c)  reverse  the  sign  if  forward  motion. 

(d)  reverse  the  sign  if  the  x  coordinate  of  Pq  is  smaller  than  the  x  coordinate  of  the 
FOE. 

(e) if the final sign is negative then the surface is convex, otherwise it is concave.
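
Step 1(b) of the algorithm, computing the second-image coordinates from the motion transformation, can be sketched as a rigid motion followed by perspective projection. The rotation order (X, then Y, then Z), the focal length of 1 and all function names are assumptions of this sketch; the text only states that the motion is a rotation followed by a translation.

```python
import math

def _rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def _rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def _rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def move_point(p, angles_deg, t):
    """Apply R = Rz * Ry * Rx (assumed order) to p, then add the translation t."""
    ax, ay, az = (math.radians(d) for d in angles_deg)
    r = _matmul(_rot_z(az), _matmul(_rot_y(ay), _rot_x(ax)))
    return tuple(sum(r[i][k] * p[k] for k in range(3)) + t[i] for i in range(3))

def project(p):
    """Perspective projection with focal length 1 (assumed)."""
    return (p[0] / p[2], p[1] / p[2])

def second_image_points(points, angles_deg, t):
    """Image coordinates of 3D points after the motion."""
    return [project(move_point(p, angles_deg, t)) for p in points]
```

For instance, `second_image_points(pts, (15, -20, 5), (2, -2, 10))` applies the motion parameters quoted for the synthetic examples, under the assumed rotation order.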

The first example is a synthetic sphere. The motion of the sphere was a translation of
(2, -2, 10), rotation of 15° around the X-axis, rotation of -20° around the Y-axis, and rotation
of 5° around the Z-axis. The center of the sphere was initially located at (0, 0, 50), with radius
20. It had moved 2.7° of arc. The zero-crossing of T(r), i.e. the translation component of the
motion, is shown in figure 8 at three arbitrary points on the sphere. The three zero-crossings
intersect at the FOE. Figure 8 also illustrates the resulting classification: all the points on
the sphere have been correctly classified as convex, which is shown by the particular grey level
assigned to all of them.

The second example is a synthetic torus. The motion of the torus was the same as that
of the sphere. The center of the torus was initially located at (0, 0, 50), with large radius 10
and small radius 5. The zero-crossings of T(r) are shown in figure 9 at four arbitrary points
on the torus, two elliptic points and two hyperbolic points. Figure 9 illustrates the resulting
classification: the torus has been correctly classified as being composed of a convex region on
the outside and a hyperbolic region on the inside. The two classes are marked by different grey
levels. Note the emergence of the parabolic line on the torus (the line separating the hyperbolic
region from the convex region, whose type is parabolic). It is often argued that these parabolic
lines are important for image representation (see [17]).

3.3 Surface curvature from stereo disparities

With general motion we had to know the location of the focus of expansion to disambiguate
completely the sign of T(r) at a single point. The least we had to do was to repeat the analysis
at more than one point in order to locate the focus of expansion. This computation is useful by
itself, since the location of the focus of expansion is important for other purposes like navigation.
However, we can use the limited knowledge on the relative location of the two cameras that is
available if the two images are obtained as a stereo pair. In this case it is possible to obtain
at each point the coordinates up to a scaling factor (like in perspective projection) in a new
coordinate system whose focus of expansion is fixed: it is the origin of the coordinate system in
either of the cameras. Thus it will be sufficient to apply the sign operator at a single point to
be able to classify it fully, namely, disambiguating the elliptic case to convex or concave.

Figure 8: Classification of a sphere from two images taken in motion, see text. Left: the optical
flow vectors do not intersect and do not reveal much about the motion. Right: the translational
components of the motion field intersect at the focus of expansion.

Figure 9: Classification of a torus from two images taken in motion, see text. The final classification
is shown by the shading: light grey for hyperbolic and dark grey for convex. Left:
the optical flow vectors, right: the zero-crossings of T(r): the axes of zero curvature and the
translation component of the motion.

I make the following assumptions: given two cameras, assume that the principal rays intersect
at a fixation point. Assume also that the plane that passes through both cameras and the
fixation point includes the X-axes of both cameras. The following coordinate system will be
used (see figure 10): let the fixation point be the origin, the plane through the origin and the
two cameras be the X-Z plane, and the line perpendicular to this plane through the origin be
the Y-axis. On the X-Z plane, the principal rays of both cameras intersect at the origin and
create an angle $2\mu$ between them. Let the Z-axis be the angle-bisector of $2\mu$, and the X-axis
perpendicular to the Z-axis.

Figure 10: Above, the 3D coordinate system defined by two cameras. Below, the image plane
of the right camera. Point O is the projection in the image plane of the 3D point P. Its polar
coordinates $R$ and $\vartheta$ are shown.

Let P = z(x, y, 1) in the new coordinate system. Let (R_l, θ_l) and (R_r, θ_r) be the projections
in polar coordinates of P on the left and right images respectively (see figure 10). Then the
following holds (see [19]):

$$ x = \tan\mu \cdot \frac{\cot\theta_r + \cot\theta_l}{\cot\theta_r - \cot\theta_l}, \qquad y = \frac{2\sin\mu}{\cot\theta_r - \cot\theta_l} $$
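As an illustration, the two expressions for x and y can be evaluated directly from the polar angles measured in the left and right images. The following Python sketch assumes angles in radians; the function name `stereo_xy` and the error handling are hypothetical and do not come from the original text.

```python
import math

def stereo_xy(theta_l, theta_r, mu):
    """Scaled coordinates (x, y) of P = z(x, y, 1) from the polar angles of its
    projections in the left and right images; mu is half the convergence angle.
    All angles are in radians (an assumption of this sketch)."""
    cot_l = 1.0 / math.tan(theta_l)
    cot_r = 1.0 / math.tan(theta_r)
    denom = cot_r - cot_l
    if denom == 0.0:
        # Equal cotangents make both expressions undefined.
        raise ValueError("cot(theta_r) equals cot(theta_l)")
    x = math.tan(mu) * (cot_r + cot_l) / denom
    y = 2.0 * math.sin(mu) / denom
    return x, y

# Example with the 30 degree convergence angle used in the experiments (mu = 15 degrees).
x, y = stereo_xy(3 * math.pi / 4, math.pi / 4, math.radians(15))
```

Only the ratio structure of the formulas is used here; the depth scale z is not recovered, in line with the text.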


Now the "first" image in the previous section will be one of the two actual images and
the "second" image will be the perspective projection in the coordinate system defined above.
Thus the focus of expansion in the first image is the origin of the camera. The sign bisector
at direction r0, the orientation along which T(r) changes its sign regardless of the sign of the
normal curvature, is the line connecting O0 to the origin. Therefore the sign of T(r) can be
directly used to obtain the sign of the normal curvature. For convenience, I compute T(r) as if
the perspective projection in the second coordinate system is on the X-Z plane, a modification
that does not affect any of the underlying arguments. Thus,

$$ T_{st} = \frac{(\cot\theta_l^1 - \cot\theta_l^0) - (\cot\theta_r^1 - \cot\theta_r^0)}{(\cot\theta_l^1 - \cot\theta_l^0) + (\cot\theta_r^1 - \cot\theta_r^0)} - \frac{(\cot\theta_l^2 - \cot\theta_l^0) - (\cot\theta_r^2 - \cot\theta_r^0)}{(\cot\theta_l^2 - \cot\theta_l^0) + (\cot\theta_r^2 - \cot\theta_r^0)} $$

If O0 = (x0, y0), then r0 = arctan(y0/x0). Thus if O1 = (x1, y1) is chosen so that
arctan(y0/x0) < arctan(y1/x1) < arctan(y0/x0) + 180°, then from proposition 2 the sign of
T_st(r) gives the sign of the normal curvature unambiguously. The same algorithm can now be
used to classify surface patches from stereo disparities.
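The stereo operator is a difference of two ratios of cotangent differences, one ratio for each of the two neighboring points. The Python sketch below is a minimal illustration under stated assumptions: the inputs are the cotangents of the polar angles of points 0, 1 and 2 in the left and right images, and the ratio for point 2 is subtracted from the ratio for point 1. The names `ratio` and `t_stereo` are hypothetical.

```python
def ratio(cot_l, cot_r, i):
    """(dl - dr) / (dl + dr), where dl and dr are the cotangent differences of
    point i relative to point 0 in the left and right images."""
    dl = cot_l[i] - cot_l[0]
    dr = cot_r[i] - cot_r[0]
    if dl + dr == 0:
        raise ValueError("degenerate configuration: denominator is zero")
    return (dl - dr) / (dl + dr)

def t_stereo(cot_l, cot_r):
    """Stereo operator for points 0, 1, 2 (index order), assuming the two
    ratios are combined as ratio(point 1) - ratio(point 2)."""
    return ratio(cot_l, cot_r, 1) - ratio(cot_l, cot_r, 2)

# Cotangents for points 0, 1, 2 in the left and right images (made-up values).
value = t_stereo([0.0, 1.0, 2.0], [0.0, 3.0, 1.0])
```

Only the sign of the returned value is meaningful for the classification; its magnitude depends on the camera geometry.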

The classification algorithm used in the following examples is as in section 3.2, with the
following differences:


1. T_st(r) is computed instead of T(r).


2. the sign of T(r) at r = 90° needs to be reversed only if the signs of the x- and y-coordinates
of P0 are opposite (here we use the fact that the effective FOE is located at the origin of
the coordinate systems).

3. using the origin as the FOE, the zero-crossings of T_st(r) that correspond to the two
zero-curvature axes in the hyperbolic case and the single zero-curvature axis in the parabolic
case are isolated, from which the maximum and minimum curvature axes are immediately
obtained. In the elliptic case the axis of minimum curvature is estimated by the r that
minimizes T_st(r), and the maximum curvature axis is the perpendicular axis.
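Once the sign of the normal curvature is available at a set of sampled orientations, the surface type follows from the pattern of signs. The Python sketch below is a simplified illustration, not the algorithm of section 3.2 itself: it assumes the signs are already given as -1, 0 or +1, and the function name `classify_patch` is hypothetical. Which elliptic sign corresponds to convex and which to concave depends on the sign convention, so the sketch reports the sign rather than choosing a label.

```python
def classify_patch(signs):
    """Classify a surface patch from the signs (-1, 0, +1) of the normal
    curvature sampled at several orientations in the tangent plane."""
    signs = list(signs)
    has_pos = any(s > 0 for s in signs)
    has_neg = any(s < 0 for s in signs)
    has_zero = any(s == 0 for s in signs)
    if has_pos and has_neg:
        return "hyperbolic"          # the sign changes: two zero-curvature axes
    if not has_pos and not has_neg:
        return "planar"              # zero curvature in every direction
    if has_zero:
        return "parabolic"           # one zero-curvature axis, otherwise one sign
    return "elliptic (positive)" if has_pos else "elliptic (negative)"

label = classify_patch([1, 1, -1, -1, 1, 1, -1, -1])
```

With discrete orientations a parabolic patch may never produce an exact zero, which is why the text isolates zero-crossings of the operator instead of testing for zero values.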

Figure  11  shows  classification  results  for  synthetic  data  of  a  torus,  a  cylinder,  a  cone,  a 
hyperbola  and  a  sphere.  All  the  objects  but  the  torus  were  centered  at  (20,20,50)  (in  the  above 
coordinate  system)  with  the  other  parameters  set  to  4.  The  torus  was  centered  at  (20,20,20), 
with big radius 8 and small radius 4. The convergence angle of the camera (2μ) was 30°. The
distance  between  both  cameras  and  the  fixation  point  was  150  for  the  torus  and  50  for  the  other 
objects.  The  shadings  are  explained  in  the  legend  of  the  figure.  The  results  are  accurate  both 
for  surface  classification  and  the  directions  of  the  principal  and  the  zero  axes. 


3.4  The computation of a 1D curvature

We have computed the sign of the surface curvature at a point by computing the sign of the
curvature of curves whose tangents span all directions in the tangent plane of the surface at the
point. Each of these curves was defined by three points on the surface and had the property
that the projections of the three points in the first image were collinear. In this case the sign of
the 2D curvature of the projections of the three points in the other image relative to the FOE
gave the sign of the 3D curvature of the 3D curve on the surface. This is also the sign of the
normal curvature at the direction of the tangent to this 3D curve.

Figure 11: First row, left: a sphere, middle: a torus, right: inside a torus. Second row, left: a
hyperbola, middle: a tilted cylinder, right: a double cone. The shadings mean the following:
surface classification: the lightest grey marks hyperbolic regions (internal rings in both toruses,
the hyperbola), darker shade of grey marks parabolic regions (cylinder and cone), darker grey
marks convex regions (sphere, external ring of torus), and the darkest grey marks concave
regions (external ring of inner torus);

axes: white marks axes of zero-curvature, grey marks axes of minimum curvature or maximal
negative curvature, and black marks axes of maximal curvature.

This scheme can be generalized to estimate the sign of the curvature (though not the normal
curvature) of other 3D curves defined by three points in the two images. A generalized rule
would be the following: let α be the 2D angle between three points in the first image (see
figure 12), with 0 < α < 90° if the angle is turned towards the FOE in that image and
90° < α < 180° otherwise. Thus for backward motion, if α increases from the first image to the
second, the sign of the curvature is positive, otherwise it is negative (figure 12a). This
generalized rule yields the correct sign in many cases. Figure 12b illustrates the deterioration
in performance when the angle α between the three points in the first image, which measures
the deviation from collinearity, increases.


Figure 12: a) an example of the change in the 2D curvature of three points originating from a
concave curve in 3D from one image to the next (panels: first image, second image; the FOE is
marked). In the upper and middle examples 0 < α < 90°, in the lower example 90° < α < 180°.
b) The generalized rule (see text) is not exact; its performance deteriorates with the amount of
deviation from collinearity in the first image (α in the text). In this example the motion is a
translation of (10,0,10), rotation of 10° around the X-axis, rotation of -10° around the Y-axis,
and rotation of 10° around the Z-axis.


4  Sensitivity  to  errors 

Small errors in the data due to quantization errors in discrete data and noise have quite
devastating effects on the estimation of local surface type. This is true for any algorithm;
therefore the data (either disparities or reconstructed depth) has to be substantially smoothed
before the surface type can be meaningfully computed. To estimate the error rate before
smoothing, I compute the percentage of correct evaluations of the sign of the normal curvature
at all directions over all the surface (that is, at the same data points that were used for the
previous classification examples).

The error rate is first computed for the simple 2D algorithm described in section 3. It is
compared to the error rate of the best alternative algorithm (both before smoothing). This
algorithm estimates the 3D coordinates of a matched pair by the point closest to two 3D rays,
each passing through one camera and the projection of the feature on its image. (These rays
ideally intersect at the exact location of the feature in 3D.) The algorithm uses perfect
knowledge of the motion or camera parameters; therefore the usually large errors introduced
while computing these parameters from the noisy data itself are artificially avoided. As expected,
when the recursive error due to the computation of the motion parameters from noisy data is
eliminated, the best exact algorithm does better than the 2D algorithm, but not much better.
The results of the comparison are given in table 1 for stereo and table 2 for motion. Data is
given for different objects, different resolution levels (measured in the number of pixels in the
intervals ||O1 - O0|| or ||O2 - O0||), and different noise levels (where the standard deviation
is measured in percent of the intervals ||O1 - O0|| or ||O2 - O0||).
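The 3D reference algorithm estimates a point as the one closest to two rays. A common way to do this, used here as an assumption since the text does not spell out the computation, is to find the closest pair of points on the two lines and take their midpoint. The Python sketch below treats the rays as infinite lines; the function name `closest_point_to_two_rays` is hypothetical.

```python
def closest_point_to_two_rays(c1, d1, c2, d2):
    """Midpoint of the shortest segment between the lines c1 + s*d1 and
    c2 + t*d2, where c1, c2 are camera centers and d1, d2 ray directions."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = [a - b for a, b in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel")
    s = (b * e - c * d) / denom      # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom      # parameter of the closest point on ray 2
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2.0 for u, v in zip(p1, p2)]

# Two rays that intersect exactly at (0, 0, 1).
point = closest_point_to_two_rays((-1, 0, 0), (1, 0, 1), (1, 0, 0), (-1, 0, 1))
```

With noisy image measurements the two rays are generally skew, and the midpoint is then only an estimate of the feature location.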


object      resolution   noise SD   error rate:     error rate:          difference
                                    2D algorithm    best 3D algorithm
cylinder    -            10%        35%             ?                    5%
sphere      -            10%        37%             29%                  8%
hyperbola   -            10%        32%             26%                  6%
hyperbola   ?            -          15%             9%                   6%
hyperbola   5            -          26%             17%                  9%
torus       ?            -          23%             19%                  4%
torus       50           -          6%              3%                   3%
torus       -            10%        41%             32%                  9%
torus       -            4%         26%             18%                  8%

Table 1: Curvature from stereo: the first column gives the object type, the second column
gives the resolution (see text) if it is finite, and the third column gives the standard deviation
of the noise in percent (see text) if there is any. The next two columns give the error rate
for the 2D algorithm described in the previous section and the best 3D algorithm using exact
motion parameters (see text). The last column gives the difference in error rates between the
two algorithms.

In the 2D curvature from motion algorithm, small angles of curvature may be classified as
zero-curvature when the resolution is finite. Such directions are ignored in the computation of
the error rate. For finite resolution I compute the error rate in two cases: the first subcolumn
in table 2 is the regular error rate as before; the second subcolumn in table 2 is the error rate
if the task is performed with hyperacuity that is an order of magnitude better than visual
acuity. If a biological visual system uses its ability to compute the orientation of three points
with an order of magnitude higher precision than visual acuity (Vernier acuity), then the second
subcolumn may give a better comparison for its error rate (see section 5).
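The error rate described above can be sketched as a simple count over directions. In this Python illustration the true and estimated signs are assumed to be given as -1, 0 or +1, and directions whose estimate is zero are skipped, as the text specifies for the finite resolution case. The function name `error_rate` is hypothetical.

```python
def error_rate(true_signs, estimated_signs):
    """Percentage of directions whose estimated curvature sign is wrong.
    Directions classified as zero-curvature by the estimator are ignored."""
    counted = 0
    wrong = 0
    for t, e in zip(true_signs, estimated_signs):
        if e == 0:
            continue                 # ignored, as in the finite resolution case
        counted += 1
        if e != t:
            wrong += 1
    return 100.0 * wrong / counted if counted else 0.0

rate = error_rate([1, 1, -1, -1, 1], [1, -1, -1, 0, 1])   # one error in four counted directions
```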
object      resolution   noise SD   2D algorithm error rate   best 3D algorithm   difference
                                    regular    hyperacuity    error rate
cylinder    -            10%        43%        -              39%                 4%
sphere      -            10%        42%        -              35%                 7%
sphere      10           -          28%        3%             20%                 8%
hyperbola   -            8%         43%        -              35%                 8%
hyperbola   10           -          28%        8%             22%                 6%
torus       10           -          35%        16%            32%                 3%
torus       ?            -          28%        8%             21%                 7%
torus       -            4%         41%        -              35%                 6%
torus       -            8%         46%        -              42%                 4%

Table 2: Curvature from general motion: translation (10,-10,10) and rotation 15° around the
X-axis, -20° around the Y-axis, and 5° around the Z-axis. The columns are as in table 1,
with the difference that two error rates are given for the 2D approximate algorithm in the finite
resolution cases (see text).

5  Discussion 

The  curvature  operators  described  in  section  3  can  be  implemented  by  a  biological  system 
with  high  precision.  From  Proposition  1  we  see  that  the  operator  that  gives  the  sign  of  the 
normal  curvature  has  to  check  whether  three  points  are  collinear  or  otherwise  how  the  angle 
between  them  is  oriented.  This  is  an  example  of  a  hyperacuity  task  (see  [20]  pp.  337  for  a 
review),  namely,  the  precision  with  which  it  can  be  done  is  ten  times  higher  than  the  visual 
acuity.  Thus,  the  biological  system  may  be  capable  of  computing  the  sign  of  the  curvature 
directly,  without  recourse  to  an  operator  similar  to  T.  Because  of  the  hyperacuity  resolution, 
the  expected  error  rate,  which  is  already  of  the  same  order  of  magnitude  as  the  error  rate  of 
the  best  3D  algorithm  that  uses  known  motion  parameters,  should  be  significantly  lower  (see 
table  2).  Also,  the  algorithm  that  computes  shape  type  involves  only  line  operators  at  different 
orientations.  This  is  consistent  with  known  biological  architectures. 

Koenderink and van Doorn ([14]) showed that some important features, the sign of the
Gaussian curvature for example, are related to motion invariants of vector fields (e.g., shear).
These results are derived using vector field analysis and therefore assume the existence of a
differentiable vector field (though singularities are addressed in [21]). The results are less
general in that the curvature is assumed to be large relative to the distance to the object, and
the angular part of the rotation is assumed small. It is also not clear how the appropriate
vector field invariants can be computed. Finally, the sign of the Gaussian curvature does not
provide a complete classification of surfaces with respect to the viewer (i.e., the distinction
convex/concave). I have shown above that some interesting quantities (the sign of the Gaussian
curvature and the absolute sign of the normal curvature) can be computed with simple
hyperacuity detectors at different orientations. The analysis is exact; the only approximation
is in the computation of the curvature of a planar curve using discrete data. (It is interesting
to note here that Koenderink and van Doorn [16] have suggested the use of difference of slopes
of line segments to approximate the shear of the stereo vector field. This is in fact the operator
used above (equation 2) to determine the sign of the normal curvature.)

One can also regard the 2D algorithm of section 3 as a way to compute the direction of
motion: the focus of expansion and the direction of the translational component of the motion.
The location of the FOE is obviously important for navigation, and (the exact value of)
the translational component of the optical flow can give relative depth. Longuet-Higgins and
Prazdny ([7]) have shown that these quantities can be computed from the optical flow and
described two algorithms to compute them. Their algorithm (the one using dense data) computes
the exact value of the translational component of the optical flow, not only its direction.
Some of its drawbacks are the following: it is computationally expensive and noise-sensitive;
it assumes that the surface function is smooth enough so that it can be approximated by the
linear terms of X and Y; and it is biologically implausible. Altogether, it is given more as an
existence proof that the computation of the motion parameters and structure from motion are
possible from images only. The approximate algorithm of section 3 shows that if we do not
require a complete computation of the motion parameters then some important features of the
motion can be computed more easily and in parallel, more reliably, and by a more biologically
plausible algorithm. It can also be used before a more exact algorithm to obtain an initial
estimate of the location of the FOE and the translational component of the motion.

6  Summary 

This  work  has  been  motivated  by  two  observations.  First,  the  computation  of  the  motion 
parameters  or  the  cameras’  calibration  is  generally  complicated,  time  consuming  and  error 
sensitive.  Second,  it  is  not  clear  that  biological  vision  needs  such  a  computation  or  that  it  uses 
the  exact  recovery  of  the  depth  of  a  surface  at  each  point.  From  the  analysis  presented  above 
we  can  conclude  that  the  direct  computation  of  some  interesting  motion  and  shape  invariants 
from  matched  images  may  be  computationally  easier,  more  parallel  in  nature,  and  more  robust 
in  the  presence  of  errors.  More  specifically,  it  has  been  shown  that  the  sign  of  the  Gaussian 
curvature  of  a  surface  patch  can  be  obtained  from  motion  or  stereo  disparities  with  a  simple, 
biologically  plausible,  operator.  The  focus  of  expansion  can  also  be  obtained  from  this  analysis. 
The surface can further be classified as convex, concave, planar, cylindrical, or saddle-point. If
a sufficient number of interesting quantities can be computed in a similar way (which depends
of course on the goal of the computation), the exact motion parameters and shape need not
be computed at all. This may be the case for the limited purposes of biological vision like
recognition and navigation.

7  Appendix 

Following  are  the  proofs  of  the  propositions  in  section  3. 

Proposition 1 Let P0 denote a point on the surface of some object whose projection in the
first image is O0. Let P1 and P2 denote two other points on the same surface whose projections
in the first image are O1 and O2, and where O0, O1 and O2 are collinear. Let O0, O1, and O2
be the projections of the same three points in a second image. Assume the motion is backward
(away from the focus of expansion). Then the sign of the normal curvature of the curve ξ
passing through P0, P1 and P2 can be determined from the second image as follows:

•  if the smaller angle through O0, O1 and O2 is turned towards the focus of expansion then
the normal curvature of ξ is positive (see figure 5a).

•  if O0, O1 and O2 are collinear then the normal curvature of ξ is 0 (see figure 5b).

•  if the smaller angle through O0, O1 and O2 is turned away from the focus of expansion
then the normal curvature of ξ is negative (see figure 5c).

In forward motion (towards the focus of expansion), the interpretation of the angle is reversed.
(The motion of the coordinate system is defined to be a rotation followed by a translation.)
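The three cases of the proposition can be sketched as a small test on image points. The Python illustration below makes one interpretive assumption: the smaller angle at O0 is taken to be "turned towards" the FOE when its bisector has a positive component along the direction from O0 to the FOE. The function name `curvature_sign_from_angle`, the tolerance, and the `backward` flag are hypothetical.

```python
import math

def curvature_sign_from_angle(o0, o1, o2, foe, backward=True, tol=1e-9):
    """Sign (+1, 0, -1) of the normal curvature from the second-image
    projections o0, o1, o2 (o1 and o2 on different sides of o0) and the FOE."""
    def unit(v):
        n = math.hypot(v[0], v[1])
        return (v[0] / n, v[1] / n)

    u1 = unit((o1[0] - o0[0], o1[1] - o0[1]))
    u2 = unit((o2[0] - o0[0], o2[1] - o0[1]))
    bx, by = u1[0] + u2[0], u1[1] + u2[1]      # bisector of the smaller angle at o0
    if math.hypot(bx, by) < tol:
        return 0                                # the three points are collinear
    d = bx * (foe[0] - o0[0]) + by * (foe[1] - o0[1])
    s = (d > 0) - (d < 0)                       # +1 if the angle opens towards the FOE
    return s if backward else -s                # forward motion reverses the reading

sign = curvature_sign_from_angle((0, -1), (-1, -1), (1, -0.5), (0, 0))
```

In this example the angle at O0 opens towards the FOE at the origin, so the sketch returns +1 for backward motion and -1 for forward motion.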

Proof: 

Let w_θ denote the tangent of the curve ξ whose sign of curvature we want to estimate. The
tangents to all the curves on the surface passing through P0 must lie in the tangent plane at P0
at some angle θ (see, e.g., figure 13). The normal curvature of any curve with tangent w_θ is
equal to the exact curvature of a single curve with tangent w_θ. This curve is the intersection of
the normal section, the plane through w_θ and the normal, with the surface (see figure 13). Let
u_θ be the intersection of the normal plane and the surface. It is therefore sufficient to compute
the curvature of u_θ to obtain the normal curvature of ξ. Let n_θ denote the curvature of u_θ.

Consider the lower part of figure 13. Let N0 be some arbitrary axis through P0 that creates an
acute angle with N (that is, N · N0 > 0). We define an N0-section in a similar way to the normal
section: it is the plane that passes through N0 and the tangent line w_θ. The corresponding
N0-section intersects the surface at a curve u^0_θ. Let n^0_θ be the curvature of u^0_θ; n^0_θ
lies in the N0-section. Since n^0_θ is perpendicular to w_θ, it lies along the projection of N on
the N0-section, either in the direction of N or -N. Since the angle between N and N0 is acute, so
is the angle between N0 and the projection of N on the N0-section. Thus the sign of n^0_θ with
respect to N0 (the sign of n^0_θ · N0) is equal to its sign with respect to the projection of N on
the N0-section. This, in turn, has the same sign as its sign with respect to N (the sign of
n_θ · N), which is the sign of the normal curvature. Therefore the sign of n^0_θ with respect to
N0 (the sign of n^0_θ · N0) is equal to the sign of the normal curvature corresponding to w_θ
(the sign of n_θ · N).

The argument reverses when applied to an axis N0 that creates an obtuse angle with N
(that is, N · N0 < 0). It will break down if N and N0 are perpendicular (N · N0 = 0), a case
for which the proposition does not hold.

The first image is depicted in figure 14. We choose axis N0 to be the line of sight, the line
connecting P0 and the first camera. By definition the normal creates an acute angle with the
line of sight unless it is a boundary where the two lines are perpendicular. For a given w_θ, the
corresponding N0-section (marked in figure 14 with continuous lines) includes P0, P1 and P2
(three points on the surface as we have defined before), O0, O1 and O2 (their projections on
the first image), and the camera's pinhole. The curve u^0_θ is the line passing through P0, P1
and P2. We define n^0_θ to be the angle bisector of ν, the angle defined by P1, P0 and P2 (see
footnote 1). (Thus we can think of u^0_θ as a smooth curve passing through P0, P1 and P2 whose
tangent at P0 is the line perpendicular to the angle bisector of ν.) From the above discussion
the sign of the normal curvature is determined by whether ν is turned "towards" the camera or
"away" from it. Let P^0 be the intersection of the line of sight and the line through P1 and P2
in the plane of the N0-section (see figure 14). Then the question is whether P^0 is between the
camera and P0 or on the other side of P0.

Perspective projection of the N0-section, specifically P0, P1, P2, P^0 and the camera's pinhole,
preserves order if all points lie in the half space that is in the field of view of the projection
(or the other half space). Assume that the plane is not projected to a line, that is, the second
camera is not translating on the N0-section, for which case the analysis does not hold. Thus
the question is whether the projection of P^0 is between the projections of the camera's pinhole
and P0 or on the other side of the projection of P0. We choose the perspective projection on
the second image, where the P_i's are projected to O_i's respectively, and the camera is projected
to the focus of expansion. Thus if O^0 is between the FOE and O0 then the normal curvature is
positive, and if O^0 is on the other side of O0 then the curvature is negative. If O^0 = O0 then
P0, P1 and P2 are collinear and the normal curvature is 0. This completes the proof for the
backward motion since then P0, P1, P2 and the camera are all in the field of view of the second
camera. If the motion is forward then the first camera is not in the field of view of the second
camera. The axis N0 (the line of sight) is projected discontinuously and therefore the meaning
of the angle through the projections of P1, P0 and P2 reverses.

Footnote 1: This definition can be justified in the following way. The direction of the normal
to a plane curve at some point P0 is the radius of the circle of curvature, which is the limit of a
circle through P0 and two neighboring points P1 and P2 as they approach P0. If some fixed P1
and P2 are equidistant to P0, the radius of the circle passing through P1, P0 and P2 is also the
angle bisector of the angle ν between them. Thus the angle bisector serves as a discrete
estimator for the direction of the normal given two points, like difference operators serve to
approximate derivatives.

Figure 13: Illustration of the normal section (see text).

Figure 14: The N0-section of a plane containing P1, P0, P2 and the first camera.


Proposition 2 Let O_i be as before, where O1 and O2 are chosen on different sides of O0. Assume
backward motion (the motion is defined now as a translation followed by a rotation). Let
O_i = (x_i, y_i) denote the projections of P_i respectively in a second image, as before. Let

$$ T = \frac{y_2 - y_0}{x_2 - x_0} - \frac{y_1 - y_0}{x_1 - x_0}. $$

If O1 is chosen such that the angle through O1, O0 and the FOE going clockwise
is smaller than 180°, that is, O1 is below the sign-bisector in figure 15a, then the sign of T
equals the sign of the normal curvature of ξ. If O1 is chosen so that the angle is larger than
180° then the sign of T is opposite to the sign of the normal curvature of ξ. If the angle equals
180° then T is identically 0.


Proof: 

From the previous proposition, the sign of the normal curvature is determined by whether O^0
is between the FOE (the projection of the camera) and O0 or on the other side of O0 (see
figure 15b).


Figure 15: The perspective projection of the N0-section assuming P1, P0, P2 and the camera are
in the same field of view: a) first image, b) second image.

From its definition, T is a difference of two tangents, namely the slopes of the lines through
O0 and O2 and through O0 and O1 (figure 15b). We know that O1 and O2 lie on
different sides of the sign-bisector. Assume for simplicity that O1 and O2 lie on different sides
of a parallel to the Y-axis through O0 (figure 15b). If O1 is below the sign-bisector in figure 15b
then the sign of T is positive iff O^0 is between the FOE and O0 and negative iff O^0 is on the
other side of O0. That is, the sign of T is equal to the sign of the normal curvature of ξ if the
angle through O1, O0 and the FOE going clockwise is smaller than 180°. We have used the
previous proposition for backward motion when the motion is defined as rotation followed by
translation. If the motion is redefined as translation followed by rotation, and backward motion
is again assumed, then this condition is equivalent to the following: the sign of T is equal to
the sign of the normal curvature of ξ if the angle through O1, O0 and the FOE going clockwise
is smaller than 180°. In a similar way, the sign of T is the opposite of the sign of the normal
curvature of ξ if the angle through O1, O0 and the FOE going clockwise is larger than 180°. If
the angle through O1, O0 and the FOE equals 180°, P0, P1, P2 and the camera are collinear
and therefore T = 0. This completes the proof of the proposition.

When O1 and O2 are both on the same side of a parallel to the Y-axis through O0 the
problem can be easily fixed. This case is detected when the sign of x2 - x0 equals the sign of
x1 - x0. It is sufficient to push either O1 or O2 to be almost parallel to the Y-axis on the other
side (±∞). Usually, though, the combined use of T and T^-1 eliminates the problem.